Abstract

We provide a novel achievability proof of the Slepian-Wolf theorem for i.i.d. sources over finite alphabets. We 
demonstrate that random codes that are linear over the real field achieve the classical Slepian-Wolf rate-region. 
For finite alphabets we show that typicality decoding is equivalent to solving an integer program. Minimum entropy 
decoding is also shown to achieve exponentially small probability of error. The techniques used may be of independent 
interest for code design for a wide class of information theory problems, and for the field of compressed sensing. 

I. Introduction 

A well-known result by Slepian and Wolf in [2] characterizes the rate-region for near-lossless source coding of 
distributed sources. The result demonstrates that if two (or more) sources possess correlated data, even independent 
encoding of the sources' data can still achieve essentially the same performance as when the sources encode 
jointly. This result has important implications for information theoretic problems as diverse as sensor networks [3], 
secrecy [4], and low-complexity video encoding [5]. Unfortunately for the distributed source coding problem, codes 
that are provably both rate-optimal and computationally efficient to implement are hard to come by. Section II gives
a partial history of results for the Slepian-Wolf (SW) problem.

In this work we provide novel codes that asymptotically achieve the SW rate-region with vanishing probability
of error. Our encoding procedure consists of random linear operations over the real field $\mathbb{R}$, and the codes are
hence called Real Slepian-Wolf Codes or RSWCs. In contrast, most other codes in the literature operate over appropriate
finite fields $\mathbb{F}_q$. We demonstrate that RSWCs can be used in a way that enables the receiver to decode the sources'
information by solving a set of integer programs (IPs). Besides being interesting in their own right as a new class
of codes achieving the SW rate-region, the relation between RSWCs and IPs has some intriguing implications.

In general IPs are computationally intractable to solve. However, our code design gives us significant flexibility 
in choosing the particular IPs corresponding to our codes. That is, we show that "almost all" RSWCs result in 
IPs that have "good" performance for the SW problem. But there are well-studied classes of IPs that are known
to be computationally tractable to solve (e.g., IPs corresponding to Totally Unimodular matrices [6]). It is thus
conceivable that suitably chosen RSWCs may be decodable with low computational complexity.

Linear SW codes over finite fields were introduced in [7] and they were shown to achieve the SW rate-region. 
Decoding such codes is equivalent to finding a vertex of a hypercube satisfying some combinatorial properties. 
Such problems are computationally intractable. Our SW codes are linear over $\mathbb{R}$. Though decoding our codes may
still be difficult, we can use tools from the mature field of convex optimization for decoding our codes.

Also, our work has direct implications for the new field of Compressed Sensing (CS). In the CS setup, $N$ sources
each generate a single real number. The resulting length-$N$ sequence is $k$-sparse, i.e., can be written with at most
$k \ll N$ non-zero coefficients in a prespecified basis. A typical result [8] in this setup shows that if a receiver gets
$O(k \log N)$ random linear combinations over $\mathbb{R}$ of the sources' sequence, it can, with high probability, reconstruct
the source sequence exactly in a computationally efficient manner by solving a linear program. The CS setup is
quite similar to that of the RSWCs we design: the source sequence contains a large amount of redundancy, and
a random $\mathbb{R}$-linear mixture of the sequence suffices for exact reconstruction via optimization techniques. There
are, however, two major differences. First, RSWCs operate at information-theoretically optimal rates whereas CS 
codes are bounded away from such performance. Second, CS codes are computationally tractable, whereas we are 
currently not aware of efficient decoding techniques for RSWCs. We think this tradeoff between computational 
efficiency and rate-optimality is interesting and worthy of further investigation. 

In Section II we discuss some background and tools to be used in the subsequent sections. In Section III we
present the construction of our RSWCs and the related main results. These results are then proved in Sections IV
and V. In Section VI we present the direct construction of RSWCs for any point on the Slepian-Wolf rate-region
without time-sharing between the corner points. The universal minimum-entropy decoding algorithm is shown to
work for our RSWCs in Section VII. Section VIII shows that our RSWCs achieve the rate-region of more general
normal source networks without helpers introduced in [9]. Finally Section IX concludes the paper.

II. Background and Definitions 

Shannon's seminal source coding theorem [10] demonstrates that a sequence of discrete random variables can 
essentially be compressed down to the entropy of the underlying probability distribution generating the sequence. 
Of the many extensions sparked by that work, the Slepian-Wolf theorem [2] is the one on which the present paper builds.

A. Slepian Wolf Theorem for i.i.d. sources [2] 

Problem Statement: Two sources named Xavier and Yvonne generate two sequences of discrete random variables,
$\mathbf{X} = X_1, X_2, \ldots, X_n$ over the finite alphabet $\mathcal{X}$, and $\mathbf{Y} = Y_1, Y_2, \ldots, Y_n$ over the finite
alphabet $\mathcal{Y}$, respectively. The sequence $(\mathbf{X}, \mathbf{Y})$ is assumed to be i.i.d. with a joint distribution
$p_{X,Y}(x,y)$ that is known in advance to both Xavier and Yvonne. The corresponding marginal distributions over
$\mathcal{X}$ and $\mathcal{Y}$ are denoted by $p_X(x)$ and $p_Y(y)$ respectively. Xavier and Yvonne wish to communicate
$(\mathbf{X}, \mathbf{Y})$ to a receiver Zorba. To this end Xavier uses his encoder to transmit a message that is a function
only of $\mathbf{X}$ and $p_{X,Y}(x,y)$ to Zorba. Similarly, Yvonne uses her encoder to transmit a message that is a function
only of $\mathbf{Y}$ and $p_{X,Y}(x,y)$ to Zorba. Zorba uses his decoder to attempt to reconstruct $(\mathbf{X}, \mathbf{Y})$.
Xavier and Yvonne's encoders and Zorba's decoder comprise a SW code $\mathcal{C}$. The SW code $\mathcal{C}$ is said to be
near-lossless if the probability (over $p_{X,Y}(x,y)$) that Zorba's reconstruction of $(\mathbf{X}, \mathbf{Y})$ is incorrect
is asymptotically negligible in the block-length $n$. The rate-pair $(R_X, R_Y)$ is said to be achievable
for the SW problem if for every $\epsilon > 0$ there exists a code $\mathcal{C}$ that is near-lossless, and the average
(over $p_{X,Y}(x,y)$) numbers of bits that $\mathcal{C}$ requires Xavier and Yvonne to transmit to Zorba are at most
$n(R_X + \epsilon)$ and $n(R_Y + \epsilon)$ respectively. The set of all rate-pairs that are achievable is called the
rate-region. Slepian and Wolf's characterization of the rate-region is remarkably clean.

Theorem 1: [2] The rate-region for the Slepian-Wolf problem is given by the intersection of

$$R_X \ge H(X|Y), \qquad R_Y \ge H(Y|X), \qquad R_X + R_Y \ge H(X,Y). \tag{1}$$

Here $H(X|Y)$ and $H(Y|X)$ denote the conditional entropies and $H(X,Y)$ denotes the joint entropy of $(X,Y)$
(implicitly, over the joint distribution $p_{X,Y}(x,y)$).
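As a concrete illustration of Theorem 1, the following minimal Python sketch computes the three entropies from a joint distribution and tests whether a rate pair satisfies the inequalities in (1). The function names and the example distribution are hypothetical and chosen only for illustration; they do not come from [2].

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def sw_entropies(joint):
    """Return (H(X|Y), H(Y|X), H(X,Y)) for joint[(x, y)] = probability."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    h_xy = entropy(joint.values())
    return h_xy - entropy(py.values()), h_xy - entropy(px.values()), h_xy

def in_sw_region(joint, rate_x, rate_y):
    """Check the three inequalities of Theorem 1 for the pair (rate_x, rate_y)."""
    h_x_given_y, h_y_given_x, h_xy = sw_entropies(joint)
    return (rate_x >= h_x_given_y and rate_y >= h_y_given_x
            and rate_x + rate_y >= h_xy)

# Hypothetical example: two binary sources that disagree with probability 0.1.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}
```

For this example the corner point $(H(X), H(Y|X))$ is roughly $(1, 0.47)$, so `in_sw_region(joint, 1.0, 0.5)` holds, whereas the pair $(0.6, 0.6)$ violates the sum-rate constraint.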

B. Linear SW codes over finite fields 

The SW codes in [2] have computational complexity that is exponential for both encoding and decoding. An 
improvement was made in [7], where it was shown that random linear encoders suffice. We briefly restate that 
result here, restricting ourselves to the case when $\mathcal{X} = \mathcal{Y} = \{0,1\}$ for simplicity.

Let $D_X$ and $D_Y$ be respectively $\lceil n(R_X + \epsilon) \rceil \times n$ and $\lceil n(R_Y + \epsilon) \rceil \times n$ matrices
over the finite field $\mathbb{F}_2$, with each entry of both matrices chosen i.i.d. as either 0 or 1 with probability 1/2.
Here $\epsilon$ is an arbitrary positive constant. Abusing notation, let $\mathbf{X}$ and $\mathbf{Y}$ also denote length-$n$
column vectors over $\mathbb{F}_2$. Xavier and Yvonne's encoders are then defined respectively via the matrix multiplications
$D_X\mathbf{X}$ and $D_Y\mathbf{Y}$, and their messages to Zorba are respectively the resulting column vectors.
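The encoder just described can be sketched in a few lines of Python. This is only an illustration under assumed parameter values (`n`, `rate_y`, `eps` and the fixed seed are arbitrary choices, and the helper names are hypothetical); it shows Yvonne's side, and Xavier's is identical with his own matrix.

```python
import math
import random

def random_binary_matrix(rows, cols, rng):
    """Matrix over F2 with i.i.d. entries, each 0 or 1 with probability 1/2."""
    return [[rng.randrange(2) for _ in range(cols)] for _ in range(rows)]

def encode_f2(matrix, vec):
    """Matrix-vector product over F2, i.e. with all arithmetic taken mod 2."""
    return [sum(a & b for a, b in zip(row, vec)) % 2 for row in matrix]

rng = random.Random(0)              # fixed seed so the sketch is repeatable
n, rate_y, eps = 16, 0.5, 0.1       # illustrative values only
rows = math.ceil(n * (rate_y + eps))
d_y = random_binary_matrix(rows, n, rng)
y = [rng.randrange(2) for _ in range(n)]
message = encode_f2(d_y, y)         # the column vector Yvonne would send
```

The message has `rows` bits instead of `n`, and the map is linear over $\mathbb{F}_2$: encoding the XOR of two source vectors gives the XOR of their messages.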

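We now define Zorba's decoder. For an arbitrary distribution $p_{X,Y}(x,y)$ over finite alphabets, let the strongly
$\epsilon$-jointly typical set $A_\epsilon^{(n)}(p_{X,Y})$ [11] (henceforth simply called the typical set) be the set of all pairs
of length-$n$ sequences $(\mathbf{x}, \mathbf{y})$ such that the empirical distribution induced by $(\mathbf{x}, \mathbf{y})$ differs
component-wise from $p_{X,Y}(x,y)$ by less than $\epsilon/(|\mathcal{X}||\mathcal{Y}|)$. That is,

$$A_\epsilon^{(n)}(p_{X,Y}) = \left\{ (\mathbf{x}, \mathbf{y}) : \left| \frac{1}{n} N_{(\mathbf{x},\mathbf{y})}(a,b) - p_{X,Y}(a,b) \right| < \frac{\epsilon}{|\mathcal{X}||\mathcal{Y}|} \ \text{for every } (a,b) \in \mathcal{X} \times \mathcal{Y} \right\},$$

where $N_{(\mathbf{x},\mathbf{y})}(a,b)$ denotes the number of component pairs $(x_i, y_i)$ in $(\mathbf{x}, \mathbf{y})$ which are equal to
$(a,b)$. For simplicity of notation we denote $A_\epsilon^{(n)}(p_{X,Y})$ as $A_\epsilon$. Zorba checks to see if there exists a
unique pair of length-$n$ sequences $(\mathbf{x}, \mathbf{y})$ satisfying two conditions. First, that $D_X\mathbf{x}$ and
$D_Y\mathbf{y}$ respectively match the messages transmitted by Xavier and Yvonne. Second, that $(\mathbf{x}, \mathbf{y})$ lies
within $A_\epsilon$. If both conditions are satisfied for exactly one pair $(\mathbf{x}, \mathbf{y})$, Zorba outputs
$(\mathbf{x}, \mathbf{y})$, else he declares a decoding error.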
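The membership test for the typical set can be written directly from its definition: count each pair $(a,b)$, divide by $n$, and compare with $p_{X,Y}(a,b)$ using the tolerance $\epsilon/(|\mathcal{X}||\mathcal{Y}|)$. The sketch below assumes the distribution is given as a Python dictionary; the function name is hypothetical.

```python
from collections import Counter

def is_strongly_typical(xs, ys, joint, eps, alphabet_x, alphabet_y):
    """True if the empirical distribution of the pairs (xs[i], ys[i]) is within
    eps / (|X||Y|) of joint[(a, b)] for every pair (a, b)."""
    n = len(xs)
    counts = Counter(zip(xs, ys))                 # N_(x,y)(a, b)
    tol = eps / (len(alphabet_x) * len(alphabet_y))
    return all(
        abs(counts[(a, b)] / n - joint.get((a, b), 0.0)) < tol
        for a in alphabet_x
        for b in alphabet_y
    )
```

For instance, with a uniform distribution on $\{0,1\}^2$, the pair of sequences $(0,0,1,1)$ and $(0,1,0,1)$ is typical for any $\epsilon > 0$, while two all-zero sequences are not typical for small $\epsilon$.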



October 8, 2008 



DRAFT 



DEY, JAGGI, AND LANGBERG: "REAL" SLEPIAN-WOLF CODES 






Then [7] shows the following result. 

Theorem 2: [7] For each rate pair $(R_X, R_Y)$ in the region defined by (1) and sufficiently large $n$, with high
probability over choices of $D_X$ and $D_Y$ the corresponding SW code is near-lossless.

Many of the SW codes in the literature build on such encoders that are linear over a finite field. Some such codes 
use iteratively decodable channel codes to attain performance that is empirically "good", but performance guarantees 
have not been proven (e.g. [12]). Other codes use recent theoretical advances in channel codes to produce near- 
lossless codes that achieve any point in the SW rate-region, but cannot give guarantees on computational complexity 
(e.g. [13]). 

C. Linear codes over real fields 

As mentioned in the introduction, Compressed Sensing codes operate over real (and complex) fields, and are
structurally similar to the codes proposed in this work. The primary difference between the two sets of results is that 
our focus is on achieving information-theoretically optimal performance (at the cost of potentially high decoding 
complexity), whereas CS codes have lower decoding complexity at the cost of non-optimal rates. Some intriguing 
results on CS codes can be found in [14], [8]. 

Concurrently, codes over the real field $\mathbb{R}$ also seem to have applications for the channel coding problem. Using
significantly different techniques, Tao et al. [15] obtained channel codes that can be decoded by solving a linear
program (LP). Also, lattice codes have been shown to achieve capacity for the AWGN channel [16].

III. RSWC Model 

As is common in the SW literature [11], we focus on just the point $(H(X), H(Y|X))$ in the SW rate-region.
Time-sharing between this and the symmetric point $(H(X|Y), H(Y))$ enables us to achieve all points in the
rate-region. Thus Xavier encodes his data $\mathbf{X}$ using a classical lossless source code, and Zorba decodes it losslessly.
We henceforth discuss only Yvonne's RSWC encoder for $\mathbf{Y}$ and Zorba's corresponding decoder. In Section VI we
show how to generalize our proof techniques to get codes that achieve any point in the SW rate-region without
time-sharing. We consider only $\mathcal{X}$ and $\mathcal{Y}$ that are ordered finite subsets of $\mathbb{R}$.

RSWC Encoder: We define an $m \times n$ encoding matrix $D$. Here $m$ is a code-design parameter to be specified later,
and $D$ is chosen as follows. Each component $D_{ij}$ of $D$ is chosen randomly from a finite set $\mathcal{D}$. More precisely,
each element of $D$ is chosen i.i.d. from $\mathcal{D}$ according to a distribution $p_D$. The set $\mathcal{D}$ can be any arbitrary
finite subset of $\mathbb{R}$, and the distribution $p_D$ can be chosen arbitrarily on $\mathcal{D}$, as long as the probability of at
least two elements of $\mathcal{D}$ is non-zero. For ease of proof, we assume that $p_D$ is zero-mean; the more general case
requires only small changes in the proof details. The particular values of $\mathcal{D}$ and $p_D$ can be chosen according to
the application. We denote the $i$-th row of $D$ by $D_i$.

For a fixed block-length $n$, Yvonne's data is arranged as a column vector $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_n)^T$. To encode,
$\mathbf{Y}$ is multiplied by $D$ to get a length-$m$ real vector $\mathbf{U} = D\mathbf{Y}$. We denote the real interval
$(-n^{0.5+\epsilon}, n^{0.5+\epsilon})$ by $I_Q$. Each component $U_i$ of $\mathbf{U}$ is uniformly quantized by dividing $I_Q$ into steps
of size $\Delta_n = 2n^{-\epsilon}$. Thus $\lceil (0.5 + 2\epsilon) \log n \rceil$ bits suffice for this quantization. Note that the values
outside the range $I_Q$ are quantized to the quantization levels farthest from the origin. Here and throughout the
paper $\log(\cdot)$ denotes the binary logarithm, and $\epsilon$ is a code-design parameter that can be used to trade off between
the probability of error and the rate of the RSWC. It can be chosen as any arbitrarily small positive real number.
The quantized value of $U_i$ is denoted by $\hat{U}_i$ and the corresponding length-$m$ quantized vector is denoted by
$\hat{\mathbf{U}}$. We take $m = \lceil n(H(Y|X) + 3\epsilon)/(0.5 \log n) \rceil$ since then Yvonne's encoder will encode at about
$H(Y|X)$ bits per symbol. Thus the total number of bits Yvonne transmits to Zorba equals
$m \lceil (0.5 + 2\epsilon) \log n \rceil$, which for all sufficiently large $n$ can be bounded from above by
$nH(Y|X) + \rho\epsilon n$ for a universal constant $\rho$.
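The encoder can be sketched as follows, assuming the interval $I_Q = (-n^{0.5+\epsilon}, n^{0.5+\epsilon})$, the step size $\Delta_n = 2n^{-\epsilon}$ and $m = \lceil n(H(Y|X)+3\epsilon)/(0.5\log n) \rceil$. The text only specifies a uniform quantizer; rounding to the nearest multiple of $\Delta_n$ and transmitting the integer index are assumptions of this sketch, and all function names are hypothetical.

```python
import math

def num_rows(n, h_y_given_x, eps):
    """m = ceil(n (H(Y|X) + 3 eps) / (0.5 log n)), with log taken base 2."""
    return math.ceil(n * (h_y_given_x + 3 * eps) / (0.5 * math.log2(n)))

def rswc_encode(y, matrix, eps):
    """Return the quantization index of each component of U = D y."""
    n = len(y)
    half_width = n ** (0.5 + eps)            # I_Q = (-half_width, half_width)
    step = 2 * n ** (-eps)                   # Delta_n
    max_index = math.floor(half_width / step)
    indices = []
    for row in matrix:
        u = sum(d * v for d, v in zip(row, y))
        k = round(u / step)                  # assumed rule: nearest multiple of Delta_n
        indices.append(max(-max_index, min(max_index, k)))  # clip values outside I_Q
    return indices

def bits_per_symbol(n, h_y_given_x, eps):
    """Total transmitted bits m * ceil((0.5 + 2 eps) log n), divided by n."""
    m = num_rows(n, h_y_given_x, eps)
    return m * math.ceil((0.5 + 2 * eps) * math.log2(n)) / n
```

Ignoring the ceilings, `bits_per_symbol` equals $(H(Y|X) + 3\epsilon)(1 + 4\epsilon)$, which is one way to see where the overhead term proportional to $\epsilon$ in the rate comes from.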

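RSWC Decoder: Zorba first decodes $\mathbf{X} = \mathbf{x}$. Suppose he received $\hat{\mathbf{U}} = \hat{\mathbf{u}}$ from Yvonne. He finds
a vector $\mathbf{y}$ which is strongly $\epsilon$-jointly typical with $\mathbf{x}$, and for which $\widehat{D\mathbf{y}} = \hat{\mathbf{u}}$, where
$\widehat{D\mathbf{y}}$ denotes the quantized version of $D\mathbf{y}$. If there is no such $\mathbf{y}$, or there is more than one
such $\mathbf{y}$, he declares a decoding error.

The ensemble of RSWC encoder-decoder pairs described above is denoted by $\mathcal{C}(\epsilon, n, p_{X,Y}, p_D)$. The probability
of error of $\mathcal{C}(\epsilon, n, p_{X,Y}, p_D)$ is defined as the probability over $p_{X,Y}$ and $p_D$ that Zorba makes or declares
a decoding error. The rate of $\mathcal{C}(\epsilon, n, p_{X,Y}, p_D)$ is defined as the number of bits per source symbol that Yvonne
transmits to Zorba.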
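For very small block lengths the typicality decoder can be run by exhaustive search, which makes the decoding rule concrete. The sketch below is not an efficient decoder: it enumerates every candidate sequence, keeps those that are jointly typical with the decoded $\mathbf{x}$ and whose quantized measurements match, and reports an error unless exactly one survives. The quantizer (nearest level, clipped) and all names are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

def quantize(u, step, max_index):
    """Index of the nearest quantization level, clipped to [-max_index, max_index]."""
    return max(-max_index, min(max_index, round(u / step)))

def measure(matrix, y, step, max_index):
    """Quantized version of D y, as a tuple of indices."""
    return tuple(quantize(sum(d * v for d, v in zip(row, y)), step, max_index)
                 for row in matrix)

def jointly_typical(x, y, joint, tol):
    """Strong typicality test; `joint` must list every pair (a, b), zeros included."""
    counts = Counter(zip(x, y))
    return all(abs(counts[pair] / len(x) - p) < tol for pair, p in joint.items())

def typicality_decode(x, u_hat, matrix, joint, tol, alphabet_y, step, max_index):
    """Return the unique typical y consistent with u_hat, or None on a decoding error."""
    matches = [y for y in product(alphabet_y, repeat=len(x))
               if jointly_typical(x, y, joint, tol)
               and measure(matrix, y, step, max_index) == u_hat]
    return matches[0] if len(matches) == 1 else None
```

A single row with entries $1, 2, 4, \ldots, 2^{n-1}$ (a deliberately non-random choice) separates all binary sequences, so decoding succeeds; a single all-ones row cannot distinguish typical sequences of equal weight, and the decoder returns `None`.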

We are now in a position to state our main results. The proofs of these results are presented in the next
two sections. Theorem 3 shows that our RSWCs achieve the corner point $(H(X), H(Y|X))$ in the Slepian-Wolf
rate-region with exponentially small probability of error.

Theorem 3: For all sufficiently large $n$ there are universal positive constants $c$, $\rho$ such that the probability of
error under typicality decoding and the rate of $\mathcal{C}(\epsilon, n, p_{X,Y}, p_D)$ are at most $2^{-cn/\log n}$ and
$H(Y|X) + \rho\epsilon$ respectively.

We next show that Zorba's decoding of Yvonne's data can be done by solving an IP.

Theorem 4: If Yvonne's source is binary, then the typicality decoding of an RSWC for the point $(H(X), H(Y|X))$
is equivalent to solving an IP.

Further, we show that even for discrete memoryless sources over a larger alphabet $\mathcal{Y}$, the encoder can be
implemented as a series of RSWC encoders, each of which is for a derived binary source. Then the typicality decoder
can be implemented as a series of decoders, each of which is equivalent to solving an IP.

Theorem 5: For any finite alphabet $\mathcal{Y}$, the real SW encoding can be done using $|\mathcal{Y}| - 1$ RSWC encoders so that
the typicality decoder can be implemented by solving $|\mathcal{Y}| - 1$ IPs.

For any rate-pair in the Slepian-Wolf rate-region, a direct construction of the individual RSWC encoders for
Xavier and Yvonne without time-sharing between the corner points is presented in Section VI. It is shown that
RSWCs constructed this way also achieve the Slepian-Wolf rate-region.

Theorem 6: Any point in the Slepian-Wolf rate-region can be achieved directly by RSWCs without time-sharing. 

We also show that RSWCs can be decoded by minimum entropy decoding. 

Theorem 7: For all sufficiently large $n$ there are universal positive constants $c$, $\rho$ such that the probability of
error under minimum entropy decoding and the rate of $\mathcal{C}(\epsilon, n, p_{X,Y}, p_D)$ are at most $2^{-cn/\log n}$ and
$H(Y|X) + \rho\epsilon$ respectively.

It is argued in Section VIII that the achievable rate-region of the more general class of source networks known
as normal source networks without helpers [9] is also achieved by our RSWCs.

Theorem 8: Random RSWCs achieve the rate region of any normal source network without helpers. 

The above results will be proved in the subsequent sections. In the rest of the paper, for simplicity of exposition 
many different constants, independent of $n$, will be denoted by the same symbol "c".

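IV. Proof of Theorem 3

The probability of decoding error is given by

$$P_e \le P_1 + P_2 \tag{2}$$

where $P_1$ is the probability that $(\mathbf{X}, \mathbf{Y})$ are not strongly jointly $\epsilon$-typical, and $P_2$ is the probability
that $(\mathbf{X}, \mathbf{Y}) \in A_\epsilon$, but there is another $\mathbf{y}' \ne \mathbf{Y}$ such that $(\mathbf{X}, \mathbf{y}') \in A_\epsilon$ and
$\widehat{D\mathbf{Y}} = \widehat{D\mathbf{y}'}$.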

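Bounding $P_1$: For $P_1$, note that for any non-typical sequence $(\mathbf{x}, \mathbf{y})$, its type $P_{(\mathbf{x},\mathbf{y})}$ satisfies
$\|p_{X,Y} - P_{(\mathbf{x},\mathbf{y})}\|_1 \ge \epsilon/(|\mathcal{X}||\mathcal{Y}|)$. So, using
$D(P_{(\mathbf{x},\mathbf{y})} \| p_{X,Y}) \ge \|p_{X,Y} - P_{(\mathbf{x},\mathbf{y})}\|_1^2/(2 \ln 2)$ [11, Lemma 12.6.1] and Sanov's theorem
[11, Theorem 12.4.1], we have

$$P_1 \le (n+1)^{|\mathcal{X}||\mathcal{Y}|} \exp\left( -\frac{n\epsilon^2}{2|\mathcal{X}|^2|\mathcal{Y}|^2} \right) \le 2^{-cn} \tag{3}$$

for some positive constant $c$. The rest of this section focuses on bounding $P_2$ in (2).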

Fig. 1. Dependence structure of Lemmas. [Diagram not reproduced; its nodes are the lemmas of this section, the
Berry-Esseen Theorem, Sanov's Theorem and Eq. (3), all leading to Theorem 3.]



Bounding $P_2$: In the following, we present a sequence of lemmas leading to Lemma 13, which gives a bound
on $P_2$. A dependency "graph" of the lemmas is shown in Fig. 1 to ease understanding. We start with a general lemma
proved in the Appendix.

Lemma 9: Let $W_1, W_2, \ldots, W_n$ be a sequence of i.i.d. zero-mean random variables taking values from a finite set
$\mathcal{W}$, and let $a \triangleq \max\{|w| : w \in \mathcal{W}\}$. Then for any positive constant $A$,

$$\Pr\left\{ \left| \sum_{i=1}^{n} W_i \right| > A \right\} \le 2(n+1)^{|\mathcal{W}|} \exp\left( -\frac{A^2}{2na^2} \right).$$
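A quick numerical sanity check of this tail bound is possible in the simplest case. The sketch below assumes the bound has the form $2(n+1)^{|\mathcal{W}|}\exp(-A^2/(2na^2))$ and takes $W_i$ uniform on $\{-1,+1\}$ (so $a = 1$ and $|\mathcal{W}| = 2$), where the exact tail probability follows from the binomial distribution. The function names are hypothetical.

```python
import math

def exact_tail(n, threshold):
    """Pr{|W_1 + ... + W_n| > threshold} for i.i.d. W_i uniform on {-1, +1}.
    With k variables equal to +1, the sum equals 2k - n."""
    hits = sum(math.comb(n, k) for k in range(n + 1) if abs(2 * k - n) > threshold)
    return hits / 2 ** n

def lemma9_bound(n, threshold, a=1.0, support_size=2):
    """Assumed form of the bound: 2 (n+1)^|W| exp(-A^2 / (2 n a^2))."""
    return 2 * (n + 1) ** support_size * math.exp(-threshold ** 2 / (2 * n * a ** 2))
```

For example, with $n = 100$ and $A = 60$ the exact tail is far below the bound, which itself is already smaller than $10^{-3}$; the polynomial factor $(n+1)^{|\mathcal{W}|}$ makes the bound loose but does not affect the exponential decay.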



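We now show some properties of our quantization of $U_i = D_i\mathbf{Y}$.

Lemma 10: There exists a positive constant $c$ so that for any $\mathbf{y} \in \mathcal{Y}^n$,

$$\Pr\{ |D_i\mathbf{y}| > n^{0.5+\epsilon} \} < 2^{-cn^{2\epsilon}}.$$

Proof: Let $y_{\max}$ be the element in $\mathcal{Y}$ with maximum absolute value. For any $y \in \mathcal{Y}$, let $S_y$ be the set of
indices $j$ such that $y_j = y$, i.e., $S_y \triangleq \{ j \mid y_j = y \}$. If
$|D_i\mathbf{y}| = \left| \sum_{y \in \mathcal{Y}} \left( \sum_{j \in S_y} D_{ij}\, y \right) \right| > n^{0.5+\epsilon}$, then for at least one $y$,
$\left| \sum_{j \in S_y} D_{ij}\, y \right| > (1/|\mathcal{Y}|)\, n^{0.5+\epsilon}$. So,

$$\begin{aligned}
\Pr\{ |D_i\mathbf{y}| > n^{0.5+\epsilon} \}
&\le \Pr\left\{ \left| \sum_{j \in S_y} D_{ij}\, y \right| > \frac{1}{|\mathcal{Y}|} n^{0.5+\epsilon} \ \text{for at least one } y \right\} \\
&\le \sum_{y \in \mathcal{Y}} \Pr\left\{ \left| \sum_{j \in S_y} D_{ij}\, y \right| > \frac{1}{|\mathcal{Y}|} n^{0.5+\epsilon} \right\} \\
&\le \sum_{y \in \mathcal{Y}} 2(|S_y|+1)^{|\mathcal{D}|} \exp\left( -\frac{n^{1+2\epsilon}}{2|S_y|\, a^2 |\mathcal{Y}|^2 |y|^2} \right) \qquad (4) \\
&\le \sum_{y \in \mathcal{Y}} 2(n+1)^{|\mathcal{D}|} \exp\left( -\frac{n^{1+2\epsilon}}{2n\, a^2 |\mathcal{Y}|^2 |y_{\max}|^2} \right) \qquad (5) \\
&= |\mathcal{Y}|\, 2(n+1)^{|\mathcal{D}|} \exp\left( -\frac{n^{2\epsilon}}{2a^2 |\mathcal{Y}|^2 |y_{\max}|^2} \right) \\
&\le 2^{-cn^{2\epsilon}}
\end{aligned}$$

for some constant $c$, for large enough $n$, and where $a = \max\{ |d| : d \in \mathcal{D} \}$. Here (4) follows from Lemma 9 and
(5) follows from $|S_y| \le n$ and $|y_{\max}| \ge |y|$ for all $y \in \mathcal{Y}$. □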

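The following lemma gives, for two different $\mathbf{y}, \mathbf{y}' \in \mathcal{Y}^n$, an upper bound on the probability that
$D_i\mathbf{y}$ and $D_i\mathbf{y}'$ fall within $\Delta_n$ of each other.

Let $p_\pm$ denote the minimum of $\Pr\{D_{ij} > 0\}$ and $\Pr\{D_{ij} < 0\}$. Since $D_{ij}$ has zero mean and has at least two
symbols with non-zero probability, it follows that $p_\pm \ne 0$.

Lemma 11: If $\mathbf{y} \in \mathcal{Y}^n$ and $\mathbf{y}' \in \mathcal{Y}^n$ differ in $t$ components then

$$\Pr\{ |D_i(\mathbf{y} - \mathbf{y}')| < \Delta_n \} \le \min\left( 1 - p_\pm, \frac{c}{\sqrt{t}} \right)$$

for some fixed constant $c \in \mathbb{R}$.

Proof: Let $b_{\mathcal{Y}}$ be the smallest difference in $\mathcal{Y}$, i.e., $b_{\mathcal{Y}} = \min_{y_1, y_2 \in \mathcal{Y},\, y_1 \ne y_2} |y_1 - y_2|$.
We denote the $j$-th component $(y_j - y'_j)$ of $\mathbf{y} - \mathbf{y}'$ by $\alpha_j$. Then there are $t$ nonzero $\alpha_j$, and w.l.o.g.,
we assume that $\alpha_1, \alpha_2, \ldots, \alpha_t \ne 0$. Note that $|\{ y - y' \mid y, y' \in \mathcal{Y},\, y \ne y' \}| \le |\mathcal{Y}|^2$. So there are
at least $r \triangleq t/|\mathcal{Y}|^2$ elements among $\alpha_1, \alpha_2, \ldots, \alpha_t$ which are the same. Let us assume, w.l.o.g., that
$\alpha_1 = \alpha_2 = \cdots = \alpha_r$. Let $\sigma^2$ be the variance of $D_{ij}$. Then the random variables
$V_1 = \alpha_1 D_{i1}, V_2 = \alpha_2 D_{i2}, \ldots, V_r = \alpha_r D_{ir}$ are i.i.d. with zero mean and variance
$\sigma'^2 = |\alpha_1|^2 \sigma^2$. The central limit theorem states that the distribution of the normalized sum
$W_r = \frac{1}{\sigma'\sqrt{r}} \sum_{j=1}^{r} V_j$ approaches the normal $\mathcal{N}(0,1)$ distribution as $r$ increases. The
Berry-Esseen theorem [17] gives a uniform upper bound on the deviation of the cumulative distribution function
(cdf) of $W_r$ from the cdf $\Phi$ of $\mathcal{N}(0,1)$. The Berry-Esseen bound is given by

$$|\Pr\{W_r \le w\} - \Phi(w)| \le \frac{\beta\gamma}{\sigma'^3 \sqrt{r}} \tag{6}$$

for any $w \in \mathbb{R}$. Here $\gamma = E[|V_1|^3]$ is the third absolute moment of $V_1$, and $\beta$ is a universal constant whose
value has been improved over the decades. We use the Berry-Esseen bound to prove the lemma as below.

$$\begin{aligned}
\Pr\{ |D_i(\mathbf{y} - \mathbf{y}')| < \Delta_n \}
&= \Pr\{ -\Delta_n < D_i(\mathbf{y} - \mathbf{y}') < \Delta_n \} \\
&= \Pr\left\{ \frac{-\Delta_n - \sum_{j=r+1}^{n} D_{ij}(y_j - y'_j)}{|\alpha_1| \sigma \sqrt{r}} < W_r < \frac{\Delta_n - \sum_{j=r+1}^{n} D_{ij}(y_j - y'_j)}{|\alpha_1| \sigma \sqrt{r}} \right\} \qquad (7) \\
&\le \frac{2}{\sqrt{2\pi}} \cdot \frac{\Delta_n}{\sigma b_{\mathcal{Y}} \sqrt{r}} + 2 \times \frac{\beta\gamma}{\sigma'^3 \sqrt{r}}. \qquad (8)
\end{aligned}$$

Eq. (8) follows by using the Berry-Esseen bound (6) on the normalized sum $W_r$. The first term in (8) is an upper
bound on the probability of $\mathcal{N}(0,1)$ lying in the interval in (7), whose length is
$2\Delta_n/(|\alpha_1|\sigma\sqrt{r}) \le 2\Delta_n/(\sigma b_{\mathcal{Y}}\sqrt{r})$. This bound is obtained by multiplying the maximum value
$1/\sqrt{2\pi}$ of the probability density function of $\mathcal{N}(0,1)$ by the length of the interval. The deviation of the
cdf of $W_r$ from that of $\mathcal{N}(0,1)$ at each boundary point of the interval is bounded by the Berry-Esseen bound.
The second term in (8) is the sum of this bound at these two boundary points. Since
$\gamma/\sigma'^3 = E[|D_{ij}|^3]/\sigma^3$ does not depend on $\alpha_1$, $\Delta_n \le 2$, and $r \ge t/|\mathcal{Y}|^2$, the right-hand side
of (8) is at most $c/\sqrt{t}$ for some constant $c$.

For $t > 0$, there is at least one $j$ such that $y_j \ne y'_j$. Let us assume, w.l.o.g., that $y_1 \ne y'_1$. For large
enough $n$, $\Delta_n < b_{\mathcal{Y}} \times \min_{d \in \mathcal{D},\, d \ne 0} |d|$. So,

$$\Pr\{ |D_i(\mathbf{y} - \mathbf{y}')| < \Delta_n \} \le 1 - p_\pm.$$

This can be easily checked by considering the change in the value from $\sum_{j=2}^{n} D_{ij}(y_j - y'_j)$ to
$D_i(\mathbf{y} - \mathbf{y}')$. □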
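The $c/\sqrt{t}$ behaviour can be observed exactly in a special case. Assume, purely for illustration, that $\mathcal{Y} = \{0,1\}$ and that the entries of $D$ are uniform on $\{-1,+1\}$. Then $D_i(\mathbf{y} - \mathbf{y}')$ is a sum of $t$ independent uniform $\pm 1$ variables, and once $\Delta_n < 1$ the event $|D_i(\mathbf{y} - \mathbf{y}')| < \Delta_n$ is the event that this sum equals zero. The sketch below (hypothetical names) computes that probability and the largest observed value of $\sqrt{t}$ times it.

```python
import math

def prob_sum_is_zero(t):
    """Pr{sum of t independent uniform +/-1 variables equals 0}."""
    return math.comb(t, t // 2) / 2 ** t if t % 2 == 0 else 0.0

# Largest value of sqrt(t) * Pr{sum = 0} over a range of t: an empirical "c".
c_observed = max(math.sqrt(t) * prob_sum_is_zero(t) for t in range(1, 401))
```

In this special case the observed constant stays below $\sqrt{2/\pi} \approx 0.8$, in line with the standard estimate $\binom{2k}{k} \le 4^k/\sqrt{\pi k}$, so the probability indeed decays like $c/\sqrt{t}$.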
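Lemma 12: Let $\mathbf{y}$ and $\mathbf{y}'$ be any two vectors differing in $t$ components. Then for some constant $c$ and a
constant $\rho < 1$, both independent of $\mathbf{y}$ and $\mathbf{y}'$, we have

$$\Pr\{ \widehat{D_i\mathbf{y}} = \widehat{D_i\mathbf{y}'} \} \le \min\left( \rho, \frac{c}{\sqrt{t}} \right)$$

for large enough $n$.

Proof:

$$\begin{aligned}
\Pr\{ \widehat{D_i\mathbf{y}} = \widehat{D_i\mathbf{y}'} \}
&\le \Pr\{ \widehat{D_i\mathbf{y}} = \widehat{D_i\mathbf{y}'},\ |D_i\mathbf{y}| < n^{0.5+\epsilon},\ |D_i\mathbf{y}'| < n^{0.5+\epsilon} \} + \Pr\{ |D_i\mathbf{y}| > n^{0.5+\epsilon} \} + \Pr\{ |D_i\mathbf{y}'| > n^{0.5+\epsilon} \} \\
&\le \Pr\{ |D_i(\mathbf{y} - \mathbf{y}')| < \Delta_n \} + \Pr\{ |D_i\mathbf{y}| > n^{0.5+\epsilon} \} + \Pr\{ |D_i\mathbf{y}'| > n^{0.5+\epsilon} \} \\
&\le \min\left( 1 - p_\pm, \frac{c}{\sqrt{t}} \right) + 2\left( 2^{-cn^{2\epsilon}} \right) \qquad (9)
\end{aligned}$$

for large enough $n$. The first term in (9) follows from Lemma 11, and the second term is obtained by applying
Lemma 10 to the last two terms in the previous line. For any constant $c' > c$, we have
$c/\sqrt{t} + 2(2^{-cn^{2\epsilon}}) < c'/\sqrt{t}$ for large enough $n$ (recall that $t \le n$). Also, for any $\rho > 1 - p_\pm$,
$1 - p_\pm + 2(2^{-cn^{2\epsilon}}) < \rho$ for large enough $n$. So the result follows. □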

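We are now ready to present an upper bound on $P_2$.

Lemma 13: For large enough $n$,

$$P_2 \le 2^{-cn/\log n}, \tag{10}$$

where $c$ is a constant.

Proof:

$$\begin{aligned}
P_2 &\le \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \Pr\{ \exists\, \mathbf{y}' \ne \mathbf{y} \ \text{s.t.} \ \widehat{D\mathbf{y}'} = \widehat{D\mathbf{y}},\ (\mathbf{x},\mathbf{y}') \in A_\epsilon \} \\
&\le \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{\substack{\mathbf{y}' \ne \mathbf{y} \\ (\mathbf{x},\mathbf{y}') \in A_\epsilon}} \Pr\{ \widehat{D\mathbf{y}'} = \widehat{D\mathbf{y}} \} \qquad (11) \\
&= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t > 0} \sum_{\substack{(\mathbf{x},\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{y},\mathbf{y}') = t}} \left( \Pr\{ \widehat{D_1\mathbf{y}'} = \widehat{D_1\mathbf{y}} \} \right)^m \qquad (12) \\
&\le \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t > 0} \sum_{\substack{(\mathbf{x},\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{y},\mathbf{y}') = t}} \left( \min\left( \rho, \frac{c}{\sqrt{t}} \right) \right)^m \qquad (13) \\
&= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t > 0} N_{\mathbf{x},\mathbf{y}}(t) \left( \min\left( \rho, \frac{c}{\sqrt{t}} \right) \right)^m \qquad (14)
\end{aligned}$$

where $N_{\mathbf{x},\mathbf{y}}(t)$ is the number of $\mathbf{y}'$ which are jointly typical with $\mathbf{x}$ and which are at Hamming distance
$t$ from $\mathbf{y}$, i.e., $N_{\mathbf{x},\mathbf{y}}(t) \triangleq |\{ \mathbf{y}' \in \mathcal{Y}^n \mid (\mathbf{x},\mathbf{y}') \in A_\epsilon,\ d_H(\mathbf{y},\mathbf{y}') = t \}|$. Eq. (11) follows by
the union bound, Eq. (12) follows because the rows of $D$ are i.i.d., and Eq. (13) follows from Lemma 12. For $t > 0$,
let $N(t)$ denote the maximum of $N_{\mathbf{x},\mathbf{y}}(t)$ over all possible typical $(\mathbf{x},\mathbf{y})$ pairs, i.e.,
$N(t) = \max_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} N_{\mathbf{x},\mathbf{y}}(t)$. Further, let $t_n$ denote the value of $t$ for which the expression inside the
second summation in (14) takes the maximum value for some typical $(\mathbf{x},\mathbf{y})$, i.e.,
$t_n = \arg\max_{t > 0} \left( N(t) \left( \min(\rho, c/\sqrt{t}) \right)^m \right)$. The subscript in $t_n$ is to emphasize that it is a
function of $n$. Then by substituting in (14),

$$P_2 \le n\, N(t_n) \left( \min\left( \rho, \frac{c}{\sqrt{t_n}} \right) \right)^m.$$

We emphasize here that every appearance of "c" may denote a different constant in the following.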

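For any $\delta < \epsilon/(2(H(Y|X) + 3\epsilon))$, we consider two regimes: (1) $t_n > n^{1-\delta}$ and (2) $t_n \le n^{1-\delta}$. In the
first regime, we use the bounds $N(t_n) \le 2^{n(H(Y|X)+2\epsilon)}$ [11, Theorem 14.2.2],
$\Pr\{ \widehat{D_i\mathbf{y}} = \widehat{D_i\mathbf{y}'} \} \le c/\sqrt{t_n}$, and $m = \lceil n(H(Y|X) + 3\epsilon)/(0.5 \log n) \rceil$ to get, for large
enough $n$,

$$\begin{aligned}
\log(P_2) &\le \log n + \log N(t_n) - \frac{n(H(Y|X) + 3\epsilon)}{0.5 \log n} \left( (0.5 - 0.5\delta) \log n - \log c \right) \\
&\le n(H(Y|X) + 2\epsilon) - n(H(Y|X) + 3\epsilon)(1 - \delta) + n \frac{(H(Y|X) + 3\epsilon) \log c}{0.5 \log n} + \log n \qquad (15) \\
&= -n\epsilon + n\delta(H(Y|X) + 3\epsilon) + n \left( \frac{c(H(Y|X) + 3\epsilon)}{0.5 \log n} + \frac{\log n}{n} \right).
\end{aligned}$$

Now, using $\delta < \epsilon/(2(H(Y|X) + 3\epsilon))$ and $c(H(Y|X) + 3\epsilon)/(0.5 \log n) + (\log n)/n < \epsilon/4$ for sufficiently
large $n$, we get

$$\log(P_2) < -\frac{n\epsilon}{2} + \frac{n\epsilon}{4} = -\frac{n\epsilon}{4}. \tag{16}$$

In the regime $t_n \le n^{1-\delta}$, we use the bounds $N(t_n) \le (|\mathcal{Y}| - 1)^{t_n} \binom{n}{t_n} \le (|\mathcal{Y}| n)^{t_n}$, and
$\Pr\{ \widehat{D_i\mathbf{y}} = \widehat{D_i\mathbf{y}'} \} \le \rho$ to get

$$\begin{aligned}
\log(P_2) &\le \log n + t_n \log n + t_n \log|\mathcal{Y}| - \frac{n(H(Y|X) + 3\epsilon)}{0.5 \log n} \log(1/\rho) \\
&\le \log n + n^{1-\delta} \log n + n^{1-\delta} \log|\mathcal{Y}| - \frac{cn}{\log n} \qquad (17)
\end{aligned}$$

where $c = (H(Y|X) + 3\epsilon) \log(1/\rho)$. For large enough $n$, $(\log n)^2 < cn^{\delta}/3$, which implies
$\log n < cn^{\delta}/(3 \log n)$. Also, for large enough $n$, $n^{-\delta} \log|\mathcal{Y}| < c/(3 \log n)$. So, for some constant $c'$,

$$\log(P_2) < \log n - \frac{c'n}{3 \log n} < -\frac{cn}{\log n} \tag{18}$$

for large enough $n$ and for some constant $c$.

Since $cn/\log n < n\epsilon/4$ for large enough $n$, the result follows by combining (16) and (18). □

From (2), (3), and (10), we have, for large enough $n$,

$$P_e \le P_1 + P_2 \le 2^{-cn} + 2^{-cn/\log n} \le 2^{-cn/\log n},$$

for a constant $c$, thus completing the proof of Theorem 3.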
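V. Proof of Theorem 4 and Theorem 5

We first show that for $\mathcal{Y} = \{0, 1\}$ the typicality decoding of our scheme can be done via the solution of an IP.
Recall that for a vector $\mathbf{y}$, we defined, for any $y \in \mathcal{Y}$, $S_y \triangleq \{ i \mid y_i = y \}$. Similarly with abuse of notation,
for any vector $\mathbf{x} = (x_1, \ldots, x_n)$ decoded by Zorba, and $x \in \mathcal{X}$, let us define $S_x = \{ i \mid x_i = x \}$. Substituting
$N_{(\mathbf{x},\mathbf{y})}(x, 1) = \sum_{i \in S_x} y_i$ and $N_{(\mathbf{x},\mathbf{y})}(x, 0) = |S_x| - \sum_{i \in S_x} y_i$ in the definition of $A_\epsilon$, the
constraint $(\mathbf{x}, \mathbf{y}) \in A_\epsilon$ can be written as the linear constraints

$$\left| \frac{1}{n} \sum_{i \in S_x} y_i - p_{X,Y}(x, 1) \right| < \frac{\epsilon}{2|\mathcal{X}|}, \qquad \left| \frac{1}{n} \Big( |S_x| - \sum_{i \in S_x} y_i \Big) - p_{X,Y}(x, 0) \right| < \frac{\epsilon}{2|\mathcal{X}|}, \qquad \forall x \in \mathcal{X}.$$

Moreover, the constraints $\widehat{D\mathbf{y}} = \hat{\mathbf{u}} = (\hat{u}_1, \ldots, \hat{u}_m)$ can be written as

$$\hat{u}_i - \Delta_n/2 \le D_i\mathbf{y} \le \hat{u}_i + \Delta_n/2, \qquad \forall i = 1, \ldots, m.$$

Finally we add the 'integrality' constraints, namely, that $\mathbf{y} \in \mathcal{Y}^n$.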
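To make the IP formulation tangible, the sketch below builds the constraint rows for a binary $\mathbf{y}$ and solves the resulting feasibility problem by exhaustive enumeration, which is only viable for tiny $n$. It assumes that the typicality constraints take the form of two-sided bounds on $\sum_{i \in S_x} y_i$ for each $x$, as obtained from the definition of the typical set with $|\mathcal{Y}| = 2$, and that each quantization constraint is a cell of width $\Delta_n$ centred at $\hat{u}_i$. Strict inequalities are used throughout for simplicity, and all names are hypothetical.

```python
from itertools import product

def ip_constraints(x, u_hat, matrix, joint, eps, alphabet_x, step):
    """Rows (coeffs, low, high), each meaning low < coeffs . y < high, for binary y."""
    n = len(x)
    slack = n * eps / (2 * len(alphabet_x))      # n * eps / (|X||Y|) with |Y| = 2
    rows = []
    for a in alphabet_x:
        indicator = [1 if xi == a else 0 for xi in x]    # membership in S_a
        size = sum(indicator)
        # N(a, 1) = sum of y_i over S_a must be close to n p(a, 1).
        rows.append((indicator, n * joint[(a, 1)] - slack, n * joint[(a, 1)] + slack))
        # N(a, 0) = |S_a| - sum of y_i over S_a must be close to n p(a, 0).
        rows.append((indicator, size - n * joint[(a, 0)] - slack,
                     size - n * joint[(a, 0)] + slack))
    for row, u in zip(matrix, u_hat):
        # D_i y must fall in the quantization cell centred at u.
        rows.append((list(row), u - step / 2, u + step / 2))
    return rows

def integer_solutions(rows, n):
    """All y in {0,1}^n satisfying every row (exhaustive search, tiny n only)."""
    return [y for y in product((0, 1), repeat=n)
            if all(low < sum(c * v for c, v in zip(coeffs, y)) < high
                   for coeffs, low, high in rows)]
```

Decoding succeeds exactly when `integer_solutions` returns a single vector. In practice the enumeration would be replaced by an IP solver; the point of the sketch is only that every constraint is linear in $\mathbf{y}$.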

For arbitrary finite alphabets $\mathcal{Y}$, Yvonne and Zorba perform $|\mathcal{Y}| - 1$ encoding and decoding stages, each of which
involves IP decoding of a binary vector. A sketch follows.

Let $y^{(1)}, \ldots, y^{(|\mathcal{Y}|)}$ denote the distinct values of $\mathcal{Y}$. In the first stage, instead of encoding $\mathbf{y}$ directly,
Yvonne uses $\mathcal{C}(\epsilon, n, p^1_{X,Y}, p_D)$ to encode the vector $f^1(\mathbf{y})$. Here the vector $f^1(\mathbf{y})$ equals 1 in the locations
that $\mathbf{y}$ equals $y^{(1)}$ and equals 0 otherwise, and $p^1_{X,Y}$ is the corresponding induced distribution $p_{X, f^1(Y)}$
defined on $\mathcal{X} \times \{0, 1\}$. Since $f^1(\mathbf{y})$ is a binary vector, Zorba can use the IP decoding described above, and
therefore can retrieve the locations where $\mathbf{y}$ equals $y^{(1)}$. Inductively, in the $i$-th stage, Yvonne uses
$\mathcal{C}(\epsilon, n(i), p^i_{X,Y}, p_D)$ to encode the vector $f^i(\mathbf{y})$. Here $n(i)$ equals the number of locations whose values are
still undetermined before the $i$-th stage, i.e., $n(i)$ equals $|\{ j \mid y_j \notin \{ y^{(1)}, \ldots, y^{(i-1)} \} \}|$. The length-$n(i)$
vector $f^i(\mathbf{y})$ is obtained by first throwing away the locations in $f^{i-1}(\mathbf{y})$ that equalled 1, and then marking
the remaining locations 1 if and only if the corresponding locations in $\mathbf{y}$ equal $y^{(i)}$. At each stage, Zorba
can use the IP decoding described above, and therefore can retrieve the locations where $\mathbf{y}$ equals $y^{(i)}$. Let
$f^i(Y)$ denote the corresponding binary random variable s.t. $(X, f^i(Y))$ has the joint distribution given by
$p^i_{X,Y}(x, 1) = \Pr\{ X = x, Y = y^{(i)} \mid Y \ne y^{(1)}, y^{(2)}, \ldots, y^{(i-1)} \}$, and
$p^i_{X,Y}(x, 0) = \Pr\{ X = x, Y \ne y^{(i)} \mid Y \ne y^{(1)}, y^{(2)}, \ldots, y^{(i-1)} \}$. Then by a direct extension of the grouping
axiom [18, Page 8], we have

$$\begin{aligned}
H(Y|X) = {} & H(f^1(Y)|X) + (1 - p_Y(y^{(1)})) H(f^2(Y)|X) + (1 - p_Y(y^{(1)}) - p_Y(y^{(2)})) H(f^3(Y)|X) + \ldots \\
& + \left( p_Y(y^{(|\mathcal{Y}|-1)}) + p_Y(y^{(|\mathcal{Y}|)}) \right) H(f^{|\mathcal{Y}|-1}(Y)|X). \qquad (19)
\end{aligned}$$
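The stage-wise decomposition of $H(Y|X)$ can be verified numerically. The sketch below assumes that the $i$-th term is the probability that $Y$ is not among the first $i-1$ values, multiplied by the conditional entropy of the binary indicator of $Y = y^{(i)}$ under the distribution conditioned on that event. The helper names and the example distribution are hypothetical.

```python
import math

def cond_entropy(joint):
    """H(V|X) in bits, for joint[(x, v)] = probability."""
    px = {}
    for (x, _), p in joint.items():
        px[x] = px.get(x, 0.0) + p
    return -sum(p * math.log2(p / px[x]) for (x, _), p in joint.items() if p > 0)

def stage_decomposition(joint, y_values):
    """Sum over stages i of Pr{Y not in first i-1 values} * H(f^i(Y) | X)."""
    total, remaining = 0.0, 1.0
    for i in range(len(y_values) - 1):
        alive = set(y_values[i:])                # values still undetermined at stage i
        stage = {}
        for (x, y), p in joint.items():
            if y in alive:
                key = (x, 1 if y == y_values[i] else 0)
                stage[key] = stage.get(key, 0.0) + p / remaining
        total += remaining * cond_entropy(stage)
        remaining -= sum(p for (_, y), p in joint.items() if y == y_values[i])
    return total
```

For any joint distribution on a finite alphabet, `stage_decomposition` agrees with `cond_entropy` applied to the original distribution up to floating-point error, which is what allows the per-stage rates to add up to $H(Y|X)$.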

Clearly, for a single stage encoding/decoding, the average codelength for Yvonne is bounded by $nH(Y|X) + c\epsilon n$.
For a multi-stage encoding/decoding as described above, for a typical $\mathbf{y}$, the block length at the $i$-th stage is
bounded by $n(i) \le n(1 - \Pr\{ Y \in \{ y^{(1)}, y^{(2)}, \ldots, y^{(i-1)} \} \} + \epsilon)$ and so the codelength is bounded as

$$L_i \le n(1 - \Pr\{ Y \in \{ y^{(1)}, y^{(2)}, \ldots, y^{(i-1)} \} \} + \epsilon) H(f^i(Y)|X) + c_i \epsilon n$$

for some constants $c_i$. The average codelength is thus bounded using (19) by

$$L \le \sum_{i=1}^{|\mathcal{Y}|-1} L_i \le nH(Y|X) + c\epsilon n \tag{20}$$

for some constant $c$. If $\mathbf{y}$ is not typical, then in the worst case, the block length $n(i) = n$ for each $i$. Then the
overall codelength is bounded by $L \le c'n$ for some constant $c'$. Since the probability of the non-typical set is
exponentially small, the overall average codelength is still bounded by (20) for some constant $c$. Hence the overall
rate of this multistage RSWC differs from $H(Y|X)$ by at most $c\epsilon$, where $c$ is some constant dependent only on $p_{X,Y}$.
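The overall probability of error can be bounded as

$$P_e^n \le P_1 + \sum_{i=1}^{|\mathcal{Y}|-1} P_{2,i}, \tag{21}$$

where $P_1$ is the probability that the vector $\mathbf{y}$ is not strongly typical, and $P_{2,i}$ is the conditional probability of
error at the $i$-th stage of decoding given that the vector $\mathbf{y}$ is strongly $\epsilon$-typical and the decoding till the
$(i-1)$-th stage is correct. If $\mathbf{y}$ is strongly $\epsilon$-typical, then the block length at the $i$-th stage is
$n(i) \ge n \times \sum_{j=i}^{|\mathcal{Y}|} (p_Y(y^{(j)}) - \epsilon/|\mathcal{Y}|) \ge n(p_Y(y^{(|\mathcal{Y}|)}) - \epsilon/|\mathcal{Y}|)$. So,

$$P_{2,i} \le \exp\left( -\frac{c'n(i)}{\log(n(i))} \right) \le \exp\left( -\frac{c'n(i)}{\log n} \right) \le \exp\left( -\frac{c'n\,(p_Y(y^{(|\mathcal{Y}|)}) - \epsilon/|\mathcal{Y}|)}{\log n} \right).$$

Since $P_1$ is also exponentially small, the overall probability of error for the multistage encoding/decoding is bounded
as

$$P_e^n \le \exp\left( -\frac{cn}{\log n} \right).$$

□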



VI. Real SW coding without timesharing 

Any rate-pair in the SW rate-region can also be directly achieved by RSWCs without timesharing between the
schemes achieving the rate-pairs $(H(X|Y), H(Y))$ and $(H(X), H(Y|X))$. Let $(R_1, R_2)$ be a rate-pair in the SW
rate-region. Let $m_1 = \lceil n(R_1 + 3\epsilon)/(0.5 \log n) \rceil$ and $m_2 = \lceil n(R_2 + 3\epsilon)/(0.5 \log n) \rceil$. Similar to the encoding
scheme of Yvonne described in Section III, Xavier chooses an $m_1 \times n$ encoder matrix $D_1$ over $\mathcal{D}$ according
to a distribution $p_D$. Similarly Yvonne chooses a random $m_2 \times n$ encoder matrix $D_2$ over $\mathcal{D}$ according to the
distribution $p_D$.¹ Xavier encodes the length-$n$ vector $\mathbf{X}$ by quantizing each component of $\mathbf{U}_1 = D_1\mathbf{X}$ uniformly
in the range $I_Q$ with step-size $\Delta_n = 2n^{-\epsilon}$ to obtain the vector $\hat{\mathbf{U}}_1$. Similarly, Yvonne encodes the length-$n$
vector $\mathbf{Y}$ by quantizing each component of $\mathbf{U}_2 = D_2\mathbf{Y}$ uniformly in the range $I_Q$ with step size
$\Delta_n = 2n^{-\epsilon}$ to obtain the vector $\hat{\mathbf{U}}_2$.

¹ Our arguments go through even if the elements of $D_1$ and $D_2$ are chosen from different sets $\mathcal{D}_1$ and $\mathcal{D}_2$ according
to some distributions. We restrict to $\mathcal{D}_1 = \mathcal{D}_2$ and the same distribution for the elements of $D_1$ and $D_2$ for simplicity.

Zorba finds a unique jointly strongly $\epsilon$-typical pair $(\mathbf{x}, \mathbf{y})$ so that $\widehat{D_1\mathbf{x}} = \hat{\mathbf{U}}_1$ and
$\widehat{D_2\mathbf{y}} = \hat{\mathbf{U}}_2$. If there is no such pair, or if there is more than one such pair, then the decoder declares
an error. The probability of error can be bounded as

$$P_e^n \le P_1 + P_{21} + P_{22} + P_{23}, \tag{22}$$

where $P_1$, as before, is the probability that $(\mathbf{X}, \mathbf{Y})$ is not jointly strongly $\epsilon$-typical, $P_{21}$ is the probability
that there is an $\mathbf{x}' \ne \mathbf{X}$ which is also jointly strongly $\epsilon$-typical with $\mathbf{Y}$ and $\widehat{D_1\mathbf{x}'} = \hat{\mathbf{U}}_1$, $P_{22}$ is
the probability that there is a $\mathbf{y}' \ne \mathbf{Y}$ which is also jointly strongly $\epsilon$-typical with $\mathbf{X}$ and
$\widehat{D_2\mathbf{y}'} = \hat{\mathbf{U}}_2$, and $P_{23}$ is the probability that there is another jointly typical pair $(\mathbf{x}', \mathbf{y}')$ so that
$\mathbf{x}' \ne \mathbf{X}$, $\mathbf{y}' \ne \mathbf{Y}$, $\widehat{D_1\mathbf{x}'} = \hat{\mathbf{U}}_1$ and $\widehat{D_2\mathbf{y}'} = \hat{\mathbf{U}}_2$. We now investigate all
the terms in (22).

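Let $D_{1,i}$ and $D_{2,i}$ denote the $i$-th rows of the matrices $D_1$ and $D_2$ respectively. Similarly to Lemma 12, we
have

$$\Pr\{ \widehat{D_{1,i}\mathbf{x}} = \widehat{D_{1,i}\mathbf{x}'} \},\ \Pr\{ \widehat{D_{2,i}\mathbf{y}} = \widehat{D_{2,i}\mathbf{y}'} \} \le \min\left( \rho, \frac{c}{\sqrt{t}} \right)$$

when each pair $\mathbf{x}, \mathbf{x}' \in \mathcal{X}^n$ and $\mathbf{y}, \mathbf{y}' \in \mathcal{Y}^n$ differ in $t$ positions.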
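We define the following functions.

$$m(R) \triangleq \left\lceil \frac{n(R + 3\epsilon)}{0.5 \log n} \right\rceil,$$

$$\phi_1(h, R, \delta) \triangleq \log\left( n\, 2^{n(h + 2\epsilon)} \left( \frac{c}{\sqrt{n^{1-\delta}}} \right)^{m(R)} \right),$$

and

$$\phi_2(L, R, \delta) \triangleq \log\left( n\, (Ln)^{n^{1-\delta}} \rho^{m(R)} \right).$$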
Note that in this notation, $P_2$ in Lemma 13 is bounded by

$$\log(P_2) \le \phi_1(H(Y|X), H(Y|X), \delta) \tag{23}$$

for $t_n > n^{1-\delta}$ (see (15)). As shown in (16), this is at most $-n\epsilon/4$ for $\delta < \epsilon/(2(H(Y|X) + 3\epsilon))$ for large
enough $n$. It can be checked similarly that for $\delta < \epsilon/(2(R + 3\epsilon))$, $R \ge h$, and large enough $n$,
$\phi_1(h, R, \delta) \le -n((R - h) + \epsilon/4)$. Likewise, for $t_n \le n^{1-\delta}$, it is shown (see (17)) that

$$\log(P_2) \le \phi_2(|\mathcal{Y}|, H(Y|X), \delta), \tag{24}$$

which is at most $-cn/\log n$ (see (18)). More generally, it can be similarly proved that for any constants $L > 0$
and $\delta > 0$,

$$\phi_2(L, R, \delta) \le -\frac{c(R, \epsilon)\, n}{\log n}$$

for some constant $c(R, \epsilon) > 0$ and for large enough $n$.
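By definition,

$$P_{22} = \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \Pr\{ \exists\, \mathbf{y}' \ne \mathbf{y} \ \text{s.t.} \ \widehat{D_2\mathbf{y}'} = \widehat{D_2\mathbf{y}},\ (\mathbf{x},\mathbf{y}') \in A_\epsilon \}.$$

By similar arguments to those in the proof of Lemma 13, we have $\log(P_{22}) \le \phi_2(|\mathcal{Y}|, R_2, \delta)$ for
$t_n \le n^{1-\delta}$, and $\log(P_{22}) \le \phi_1(H(Y|X), R_2, \delta)$ for $t_n > n^{1-\delta}$. Since $R_2 \ge H(Y|X)$, it follows that for large
enough $n$,

$$\log(P_{22}) \le -\frac{c(R_2, \epsilon)\, n}{\log n}. \tag{25}$$

Similarly, for large enough $n$,

$$\log(P_{21}) \le -\frac{c(R_1, \epsilon)\, n}{\log n}. \tag{26}$$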

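As in the proof of Lemma 13, $P_{23}$ can be simplified to (27) below:

$$\begin{aligned}
P_{23} &= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \Pr\{\exists\, (\mathbf{x}',\mathbf{y}') \text{ s.t. } \mathbf{x}' \ne \mathbf{x},\ \mathbf{y}' \ne \mathbf{y},\ D_1\mathbf{x}' = D_1\mathbf{x},\ D_2\mathbf{y}' = D_2\mathbf{y},\ (\mathbf{x}',\mathbf{y}') \in A_\epsilon\} \\
&\le \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{\substack{(\mathbf{x}',\mathbf{y}') \in A_\epsilon \\ \mathbf{x}' \ne \mathbf{x},\ \mathbf{y}' \ne \mathbf{y}}} \Pr\{D_1\mathbf{x}' = D_1\mathbf{x},\ D_2\mathbf{y}' = D_2\mathbf{y}\} \\
&= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1 > 0,\, t_2 > 0} \sum_{\substack{(\mathbf{x}',\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{x},\mathbf{x}') = t_1,\ d_H(\mathbf{y},\mathbf{y}') = t_2}} \Pr\{D_1\mathbf{x}' = D_1\mathbf{x}\} \Pr\{D_2\mathbf{y}' = D_2\mathbf{y}\} \\
&= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1 > 0,\, t_2 > 0} \sum_{\substack{(\mathbf{x}',\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{x},\mathbf{x}') = t_1,\ d_H(\mathbf{y},\mathbf{y}') = t_2}} \left(\Pr\{D_{1,1}\mathbf{x}' = D_{1,1}\mathbf{x}\}\right)^{m(R_1)} \left(\Pr\{D_{2,1}\mathbf{y}' = D_{2,1}\mathbf{y}\}\right)^{m(R_2)} \\
&\le \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1 > 0,\, t_2 > 0} \sum_{\substack{(\mathbf{x}',\mathbf{y}') \in A_\epsilon \\ d_H(\mathbf{x},\mathbf{x}') = t_1,\ d_H(\mathbf{y},\mathbf{y}') = t_2}} \left(\min\left(p, \frac{c}{\sqrt{t_1}}\right)\right)^{m(R_1)} \left(\min\left(p, \frac{c}{\sqrt{t_2}}\right)\right)^{m(R_2)} \\
&= \sum_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} p_{X,Y}(\mathbf{x},\mathbf{y}) \sum_{t_1, t_2 > 0} N_{\mathbf{x},\mathbf{y}}(t_1, t_2)\, Q_1^{m(R_1)} Q_2^{m(R_2)}.
\end{aligned} \tag{27}$$

In (27), $Q_1 = \min(p, c/\sqrt{t_1})$, $Q_2 = \min(p, c/\sqrt{t_2})$, and $N_{\mathbf{x},\mathbf{y}}(t_1, t_2)$ is the number of jointly typical $(\mathbf{x}', \mathbf{y}')$ pairs
such that $\mathbf{x}'$ differs from $\mathbf{x}$ at $t_1$ locations and $\mathbf{y}'$ differs from $\mathbf{y}$ at $t_2$ locations, that is,
$N_{\mathbf{x},\mathbf{y}}(t_1, t_2) = |\{(\mathbf{x}', \mathbf{y}') \in \mathcal{X}^n \times \mathcal{Y}^n \mid (\mathbf{x}', \mathbf{y}') \in A_\epsilon,\ d_H(\mathbf{x}, \mathbf{x}') = t_1,\ d_H(\mathbf{y}, \mathbf{y}') = t_2\}|$.
We define $N(t_1, t_2) = \max_{(\mathbf{x},\mathbf{y}) \in A_\epsilon} N_{\mathbf{x},\mathbf{y}}(t_1, t_2)$, and
$(t_{1,n}, t_{2,n})$ as the pair $(t_1, t_2)$ that maximizes $N(t_1, t_2)\, Q_1^{m(R_1)} Q_2^{m(R_2)}$, that is,
$(t_{1,n}, t_{2,n}) = \arg\max_{t_1, t_2 > 0} N(t_1, t_2)\, Q_1^{m(R_1)} Q_2^{m(R_2)}$.
Then

$$P_{23} \le n^2\, N(t_{1,n}, t_{2,n})\, Q_1^{m(R_1)} Q_2^{m(R_2)}.$$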

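For $\delta < \epsilon/(2(R_1 + R_2 + 3\epsilon))$, we consider four cases.

Case I: $t_{1,n} > n^{1-\delta}$, $t_{2,n} > n^{1-\delta}$. In this case, using the bounds $N(t_{1,n}, t_{2,n}) \le 2^{n(H(X,Y) + \epsilon)}$, $Q_1 \le c/\sqrt{t_{1,n}}$, $Q_2 \le c/\sqrt{t_{2,n}}$, we have

$$\begin{aligned}
\log(P_{23}) &\le \phi_1(H(X,Y), R_1 + R_2, \delta) \\
&\le -n(R_1 + R_2 - H(X,Y) + \epsilon/4) \\
&\le -n\epsilon/4.
\end{aligned} \tag{28}$$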
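Case II: $t_{1,n} \le n^{1-\delta}$, $t_{2,n} \le n^{1-\delta}$. In this case, using the bounds $N(t_{1,n}, t_{2,n}) \le (|\mathcal{X}|n)^{t_{1,n}} (|\mathcal{Y}|n)^{t_{2,n}}$, $Q_1 \le p$, $Q_2 \le p$, we have

$$\begin{aligned}
\log(P_{23}) &\le \phi_2(|\mathcal{X}|, R_1, \delta) + \phi_2(|\mathcal{Y}|, R_2, \delta) \\
&\le -\frac{c(R_1, \epsilon)\, n}{\log n} - \frac{c(R_2, \epsilon)\, n}{\log n} \\
&= -\frac{c(R_1, R_2, \epsilon)\, n}{\log n},
\end{aligned} \tag{29}$$

where $c(R_1, R_2, \epsilon) = c(R_1, \epsilon) + c(R_2, \epsilon)$.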
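One way to see where a bound of the form $(|\mathcal{Y}|n)^{t}$ comes from (our reading; the derivation is not spelled out at this point) is to drop the typicality constraint and count all vectors at Hamming distance exactly $t$ from a fixed vector: there are $\binom{n}{t}(|\mathcal{Y}|-1)^t \le (|\mathcal{Y}|n)^t$ of them. The sketch below, with hypothetical helper names, compares the closed form against brute-force enumeration for a small case.

```python
from itertools import product
from math import comb


def hamming_sphere_size(n, t, q):
    """Number of length-n vectors over a q-ary alphabet at Hamming distance exactly t."""
    return comb(n, t) * (q - 1) ** t


def hamming_sphere_size_brute_force(y, t, alphabet):
    """Same count by enumerating every vector; only feasible for tiny n."""
    return sum(
        1
        for v in product(alphabet, repeat=len(y))
        if sum(a != b for a, b in zip(v, y)) == t
    )


if __name__ == "__main__":
    y = (0, 1, 2, 0, 1)
    alphabet = (0, 1, 2)
    n, q = len(y), len(alphabet)
    for t in range(n + 1):
        exact = hamming_sphere_size(n, t, q)
        assert exact == hamming_sphere_size_brute_force(y, t, alphabet)
        print(t, exact, (q * n) ** t)  # the last column is the looser bound (|Y| n)^t
```

Since the typical set only removes candidates, the same quantity also upper-bounds the number of typical vectors at distance $t$.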

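Case III: $t_{1,n} > n^{1-\delta}$, $t_{2,n} \le n^{1-\delta}$. In this case, using the bounds $N(t_{1,n}, t_{2,n}) \le 2^{n(H(X|Y) + 2\epsilon)} (|\mathcal{Y}|n)^{t_{2,n}}$, $Q_1 \le c/\sqrt{t_{1,n}}$, $Q_2 \le p$, we have

$$\begin{aligned}
\log(P_{23}) &\le \phi_1(H(X|Y), R_1, \delta) + \phi_2(|\mathcal{Y}|, R_2, \delta) \\
&\le -n(R_1 - H(X|Y) + \epsilon/4) - \frac{c(R_2, \epsilon)\, n}{\log n} \\
&\le -\frac{c(R_1, R_2, \epsilon)\, n}{\log n}.
\end{aligned} \tag{30}$$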



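Case IV: $t_{1,n} \le n^{1-\delta}$, $t_{2,n} > n^{1-\delta}$. As in Case III, we have

$$\log(P_{23}) \le -\frac{c(R_1, R_2, \epsilon)\, n}{\log n}. \tag{31}$$

Combining the bound on $P_1$ with (22), (25), (26), and (28)-(31), we have

$$P_e^n \le 2^{-cn/\log n}$$

for some constant $c$.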

VII. Universal decoding: Proof of Theorem 7

An encoding or decoding operation is said to be universal in a class of sources if the encoding/decoding operation 
can be chosen without the knowledge of the exact source statistics in the class. The encoding for RSWCs without 
time-sharing in Section VI results in universal encoding in the class of i.i.d. sources. The two encoders may choose
to encode at rates $R_1$ and $R_2$ and choose their encoding matrices randomly without knowledge of the distribution
of either source. The joint typicality decoding discussed earlier will be able to recover both sequences with
exponentially small probability of error as long as the rate pair $(R_1, R_2)$ lies in the Slepian-Wolf rate region of the
sources. However, though the encoders are universal, joint typicality decoding is not universal since it requires
the decoder to know the joint distribution of the sources.

In this section, we show that the well-known universal minimum entropy decoding (MED) [9], which does not
need the joint distribution of the sources, will also be able to decode our code with exponentially small probability
of error provided $m_1 \ge \lceil n(R_1 + 4\epsilon)/(0.5 \log n) \rceil$ and $m_2 \ge \lceil n(R_2 + 4\epsilon)/(0.5 \log n) \rceil$ for some $(R_1, R_2)$ in the
Slepian-Wolf rate-region of the sources. Here, the decoder finds the pair $(\mathbf{x}, \mathbf{y})$ with minimum empirical entropy
which satisfies the conditions $D_1\mathbf{x} = \mathbf{U}_1$ and $D_2\mathbf{y} = \mathbf{U}_2$. If there is more than one such pair, then the decoder
declares a decoding error.
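The decoding rule just described can be sketched as an exhaustive search for very small block lengths. This is only an illustration of the rule, not an efficient decoder: it enumerates all candidate vectors, assumes integer-valued encoding matrices so that the constraints can be tested exactly (real-valued matrices would need a numerical tolerance), and uses function names of our own choosing.

```python
from collections import Counter
from itertools import product
from math import log2


def empirical_entropy(symbols):
    """Entropy (in bits) of the type (empirical distribution) of a sequence."""
    n = len(symbols)
    return -sum((k / n) * log2(k / n) for k in Counter(symbols).values())


def consistent_vectors(D, u, alphabet, n):
    """All length-n vectors v over `alphabet` with D v == u (exact integer test)."""
    return [
        v
        for v in product(alphabet, repeat=n)
        if all(sum(d * s for d, s in zip(row, v)) == ui for row, ui in zip(D, u))
    ]


def med_decode(D1, u1, D2, u2, alphabet_x, alphabet_y, n, tol=1e-12):
    """Return the unique minimum-joint-empirical-entropy pair, or None on error."""
    xs = consistent_vectors(D1, u1, alphabet_x, n)
    ys = consistent_vectors(D2, u2, alphabet_y, n)
    scored = sorted(
        (empirical_entropy(list(zip(x, y))), x, y) for x in xs for y in ys
    )
    if not scored:
        return None  # no pair satisfies the constraints
    if len(scored) > 1 and scored[1][0] - scored[0][0] <= tol:
        return None  # more than one minimizer: declare a decoding error
    return scored[0][1], scored[0][2]


if __name__ == "__main__":
    # Toy instance: x = y = (1, 1, 0), one measurement row per source.
    print(med_decode([[1, 1, 1]], [2], [[1, 2, 4]], [3], (0, 1), (0, 1), 3))
```

In the toy instance the second row determines $\mathbf{y}$ uniquely, and among the three candidates for $\mathbf{x}$ only $\mathbf{x} = \mathbf{y}$ minimizes the joint empirical entropy, so the correlated pair is recovered.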
Before investigating the probability of error under minimum entropy decoding, let us define a weakly $\epsilon$-typical
pair $(\mathbf{x}, \mathbf{y})$ as one satisfying

$$|\log_2(p^n_{X,Y}(\mathbf{x}, \mathbf{y})) + nH(X,Y)| \le n\epsilon,$$
$$|\log_2(p^n_X(\mathbf{x})) + nH(X)| \le n\epsilon, \text{ and}$$
$$|\log_2(p^n_Y(\mathbf{y})) + nH(Y)| \le n\epsilon.$$

The set of weakly $\epsilon$-typical pairs will be denoted by $A_{\epsilon,\mathrm{weak}}$. A weakly $\epsilon$-typical vector $\mathbf{x}$ (similarly $\mathbf{y}$) is defined
as one satisfying

$$|\log_2(p^n_X(\mathbf{x})) + nH(X)| \le n\epsilon.$$

The properties of the weakly typical set may be found in [11].
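The three conditions defining a weakly $\epsilon$-typical pair translate directly into a membership test. The following is a minimal sketch assuming the joint distribution is supplied as a dictionary from symbol pairs to probabilities; the function names and the example distribution are ours.

```python
from math import log2


def entropy(pmf):
    """Entropy in bits of a pmf given as a dict symbol -> probability."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


def marginal(joint, axis):
    """Marginal of a joint pmf over pairs on coordinate `axis` (0 or 1)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out


def log_prob(seq, pmf):
    """log2 of the i.i.d. probability of `seq`; -inf if a symbol has probability 0."""
    total = 0.0
    for s in seq:
        p = pmf.get(s, 0.0)
        if p <= 0:
            return float("-inf")
        total += log2(p)
    return total


def is_weakly_typical(x, y, joint, eps):
    """Check |log2 p(.) + n H(.)| <= n*eps for the joint and both marginals."""
    n = len(x)
    px, py = marginal(joint, 0), marginal(joint, 1)
    checks = [
        (log_prob(list(zip(x, y)), joint), entropy(joint)),
        (log_prob(x, px), entropy(px)),
        (log_prob(y, py), entropy(py)),
    ]
    return all(abs(lp + n * h) <= n * eps for lp, h in checks)


if __name__ == "__main__":
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
    x = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
    print(is_weakly_typical(x, y, joint, eps=0.01))
    print(is_weakly_typical([0] * 10, [0] * 10, joint, eps=0.1))
```

The first pair has exactly the type of the source distribution and passes for any positive $\epsilon$; the all-zero pair has typical marginals but an atypical joint probability, so it fails for small $\epsilon$.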

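Let us denote the joint entropy of the type of a pair of vectors $(\mathbf{x}, \mathbf{y})$ as $H(\mathbf{x}, \mathbf{y})$, the corresponding conditional
entropies as $H(\mathbf{x}|\mathbf{y})$ and $H(\mathbf{y}|\mathbf{x})$, and the individual entropies of the vectors as $H(\mathbf{x})$ and $H(\mathbf{y})$. The probability
of error of a minimum entropy decoder is bounded as

$$P_e^n(\mathrm{MED}) \le P_1' + P_{21} + P_{22} + P_{23}, \tag{32}$$

where $P_1'$ is the probability that $(\mathbf{X}, \mathbf{Y})$ is not jointly weakly $\epsilon$-typical, $P_{21}$ is the probability that there is an
$\mathbf{x}' \ne \mathbf{X}$ such that $H(\mathbf{x}', \mathbf{Y}) \le H(\mathbf{X}, \mathbf{Y})$ and $D_1\mathbf{x}' = \mathbf{U}_1$, $P_{22}$ is the probability that there is a $\mathbf{y}' \ne \mathbf{Y}$ such
that $H(\mathbf{X}, \mathbf{y}') \le H(\mathbf{X}, \mathbf{Y})$ and $D_2\mathbf{y}' = \mathbf{U}_2$, and $P_{23}$ is the probability that there is another pair $(\mathbf{x}', \mathbf{y}')$ such that
$\mathbf{x}' \ne \mathbf{X}$, $\mathbf{y}' \ne \mathbf{Y}$, $D_1\mathbf{x}' = \mathbf{U}_1$, $D_2\mathbf{y}' = \mathbf{U}_2$ and $H(\mathbf{x}', \mathbf{y}') \le H(\mathbf{X}, \mathbf{Y})$. We will briefly discuss all the terms in
(32).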

By definition, $P_1' = \Pr\{A^c_{\epsilon,\mathrm{weak}}\}$. Since the weakly $\epsilon$-typical set is a superset of the strongly $\epsilon'(\epsilon, p_{X,Y})$-typical
set for some $\epsilon'(\epsilon, p_{X,Y})$ [19], $P_1'$ can be bounded similarly to $P_1$ as

$$P_1' \le 2^{-cn}, \tag{33}$$

where the constant $c$ depends on $p_{X,Y}$.

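Following similar steps as in the proof of Lemma 13, we have

$$P_{22} \le \sum_{(\mathbf{x},\mathbf{y}) \in A_{\epsilon,\mathrm{weak}}} p_{X,Y}(\mathbf{x}, \mathbf{y}) \sum_{t > 0} N'_{\mathbf{x},\mathbf{y}}(t) \left(\min\left(p, \frac{c}{\sqrt{t}}\right)\right)^{m_2},$$

where $N'_{\mathbf{x},\mathbf{y}}(t) = |\{\mathbf{y}' \in \mathcal{Y}^n \mid H(\mathbf{y}'|\mathbf{x}) \le H(\mathbf{y}|\mathbf{x}),\ d_H(\mathbf{y}, \mathbf{y}') = t\}|$. Now, let us define $N'(t) = \max_{(\mathbf{x},\mathbf{y}) \in A_{\epsilon,\mathrm{weak}}} N'_{\mathbf{x},\mathbf{y}}(t)$
for $t > 0$, and $t_n = \arg\max_{t > 0} \left(N'(t) \left(\min(p, c/\sqrt{t})\right)^{m_2}\right)$. Then clearly,

$$P_{22} \le n\, N'(t_n) \left(\min\left(p, \frac{c}{\sqrt{t_n}}\right)\right)^{m_2}.$$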

Note that for a given weakly typical $\mathbf{x}$, the condition $(\mathbf{x}, \mathbf{y}) \in A_{\epsilon,\mathrm{weak}}$ implies $H(\mathbf{y}|\mathbf{x}) \le H(Y|X) + 2\epsilon$. So,
$N'_{\mathbf{x},\mathbf{y}}(t) \le |\{\mathbf{y}' \in \mathcal{Y}^n \mid H(\mathbf{y}'|\mathbf{x}) \le H(Y|X) + 2\epsilon,\ d_H(\mathbf{y}, \mathbf{y}') = t\}|$. So, we can use both the bounds $N'(t_n) \le 2^{n(H(Y|X) + 3\epsilon)}$
and $N'(t_n) \le (|\mathcal{Y}|n)^{t_n}$ for large enough $n$. Then it can be shown in the same way as in the proof
of Lemma 13 that $P_{22} \le \exp(-cn/\log n)$ for $m_2 \ge \lceil n(R_2 + 4\epsilon)/(0.5 \log n) \rceil$. Similarly it can be shown that
$P_{21}, P_{23} \le \exp(-cn/\log n)$ for large enough $n$ if $m_1 \ge \lceil n(R_1 + 4\epsilon)/(0.5 \log n) \rceil$ and $m_2 \ge \lceil n(R_2 + 4\epsilon)/(0.5 \log n) \rceil$
for a rate pair $(R_1, R_2)$ in the Slepian-Wolf rate-region. Since $P_1'$ goes to zero exponentially as in (33), it follows
that

$$P_e^n(\mathrm{MED}) \le \exp(-cn/\log n)$$

for large enough $n$ for some constant $c$.

VIII. Generalization to other source networks: Proof of Theorem 8

The simplest generalization of the Slepian-Wolf source network is to multiple sources, as shown in Fig. 2.
The same proof technique can be used to show that the decoder can recover all the sources with exponentially
small probability of error if the encoders do random real encoding at rates satisfying

$$\sum_{i \in C} R_i \ge H(X_C \mid X_{C^c})$$

for each $C \subseteq \{1, 2, \ldots, k\}$. Here $C^c$ denotes the complement of $C$. Using the same proof technique as outlined in
Sec. VII, one can show that the decoder can also do minimum entropy decoding to attain vanishing probability of
error.



Fig. 2. A simple multi-source network (sources $X_1, X_2, \ldots, X_k$, each feeding its own encoder, Encoder 1 through Encoder k, and a single Decoder).



Csiszár and Körner [9] extended the result of Slepian and Wolf to more general source networks called normal
source networks (NSN) without helpers. In the following, we briefly discuss their source network and argue that
our coding technique can achieve the rate-region of an NSN without helpers.

Let $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$ denote the sets of sources, encoders and decoders respectively in the network. For any $c \in \mathcal{C}$, let
$S_c$ denote the set of source nodes from which information is received at the decoder node $c$. Let $D_c$ denote the set
of sources which are to be reproduced at $c$.

An NSN, as defined in [9] and an example of which is shown in Fig. 3, is a source network where

(i) there are no direct edges from the sources to the decoders,

(ii) $|\mathcal{A}| = |\mathcal{B}|$ and the edges from $\mathcal{A}$ to $\mathcal{B}$ define a one-to-one correspondence between the sources and encoders,

(iii) all the sets $S_c$, $c \in \mathcal{C}$, are different, and

(iv) for each pair of output vertices $c'$ and $c''$, the inclusion $S_{c'} \subseteq S_{c''}$ implies $D_{c'} \subseteq D_{c''}$.

Fig. 3. Normal Source Network

For a source $a \in \mathcal{A}$, let $X_a$ denote the i.i.d. data generated by the source. Similarly, for a subset $\mathcal{L} \subseteq \mathcal{A}$, let $X_{\mathcal{L}}$
denote the vector $(X_a)_{a \in \mathcal{L}}$. A source $a$ in an NSN is called a helper if for some $c \in \mathcal{C}$, $a \in S_c \setminus D_c$. Clearly, a
source network without helpers satisfies $S_c = D_c$ for all $c \in \mathcal{C}$. For any encoder $b \in \mathcal{B}$, let $R_b$ denote its encoding
rate. For a source network without helpers, Csiszár and Körner characterized the rate-region.

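Theorem 14: [9] The achievable rate-region of an NSN without helpers equals the set of those vectors $R = \{R_b\}_{b \in \mathcal{B}}$
which satisfy the inequalities

$$\sum_{b \in \mathcal{L}} R_b \ge H(X_{\mathcal{L}} \mid X_{S_c \setminus \mathcal{L}}) \tag{34}$$

for every output $c \in \mathcal{C}$ and set $\mathcal{L} \subseteq S_c$.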
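The inequalities (34) can be checked numerically for a small network. The sketch below assumes the joint distribution of one symbol from each source is given as a dictionary from outcome tuples to probabilities, identifies each encoder with its source via the one-to-one correspondence in condition (ii), and uses $H(X_{\mathcal{L}} \mid X_{S_c \setminus \mathcal{L}}) = H(X_{S_c}) - H(X_{S_c \setminus \mathcal{L}})$. The function names and the example network are hypothetical.

```python
from itertools import combinations
from math import log2


def entropy_of(joint, sources, subset):
    """Entropy (bits) of the marginal of `joint` on the named sources in `subset`."""
    pos = [sources.index(a) for a in subset]
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in pos)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marg.values() if p > 0)


def violated_constraints(rates, joint, sources, S, tol=1e-9):
    """Return the (decoder, subset) pairs for which inequality (34) fails."""
    bad = []
    for c, s_c in S.items():
        s_c = sorted(s_c)
        h_all = entropy_of(joint, sources, s_c)
        for size in range(1, len(s_c) + 1):
            for subset in combinations(s_c, size):
                rest = [a for a in s_c if a not in subset]
                cond = h_all - entropy_of(joint, sources, rest)  # H(X_L | X_{S_c \ L})
                if sum(rates[a] for a in subset) < cond - tol:
                    bad.append((c, subset))
    return bad


if __name__ == "__main__":
    # Example: a1 is a uniform bit, a2 copies a1, a3 is an independent uniform bit.
    sources = ["a1", "a2", "a3"]
    joint = {(0, 0, 0): 0.25, (0, 0, 1): 0.25, (1, 1, 0): 0.25, (1, 1, 1): 0.25}
    S = {"c1": {"a1", "a2"}, "c2": {"a1", "a2", "a3"}}
    print(violated_constraints({"a1": 0.5, "a2": 0.5, "a3": 1.0}, joint, sources, S))
    print(violated_constraints({"a1": 0.5, "a2": 0.5, "a3": 0.8}, joint, sources, S))
```

A single decoder with $S_c$ equal to the set of all sources gives the multi-source Slepian-Wolf conditions of Fig. 2 as a special case.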

The achievability proof of this rate-region reduces to the achievability proof of the corresponding rate-region for 
each of the networks obtained by taking all the sources and one decoder. In other words, if the encoders encode at 
rates satisfying the conditions in Theorem 14, the probability of error for each decoder is negligible. So the proof
reduces to the proof for the multiple source network shown in Fig. 2. It thus follows that the rate-region of
any NSN without helpers is achievable by random real encoding at each encoder. Moreover, the rate-region is also 
achievable with minimum entropy decoders. 

IX. Conclusion 

The Real Slepian-Wolf Codes analyzed here provide a novel achievability proof of the Slepian-Wolf theorem. 
Perhaps just as importantly, they demonstrate the intriguing possibility of design of information-theoretic codes 
via convex optimization techniques. For instance, since decoding RSWCs is equivalent to solving an optimization 
problem, it is natural to consider similar "real" codes for problems where some function of the code simultaneously 
needs to be optimized. We are currently investigating the performance of RSWCs under more structured choices 
of encoding matrices, with the hope of obtaining codes for which IP decoding is equivalent to LP decoding, and 
is therefore computationally tractable. 
Appendix
Proof of Lemma 9

First consider $\Pr\{\sum_{i=1}^n W_i > A\}$. We define $E = \{(w_1, w_2, \ldots, w_n) \mid \sum_{i=1}^n w_i > A\}$. Let $p_W$ denote the
probability mass function of $W_i$. Then

$$\Pr\left\{\sum_{i=1}^n W_i > A\right\} = \Pr\{E\} = \Pr\left\{p_n \mid \mu_{p_n} > \frac{A}{n}\right\}.$$

Here $p_n$ denotes the type of $(w_1, w_2, \ldots, w_n)$ and $\mu_{p_n}$ denotes the mean of $p_n$. By Sanov's Theorem [11, Theorem
12.4.1], we have

$$\Pr\{E\} \le (n+1)^{|\mathcal{W}|}\, 2^{-n D(p^* \| p_W)},$$

where $p^* = \arg\min_{p_n : \mu_{p_n} > A/n} D(p_n \| p_W)$. Since $p_W$ has zero mean, the "nearest" distribution to $p_W$ that has mean
greater than $A/n$ in absolute value would differ from $p_W$ in the largest absolute component by at least $A/(an)$. So,
$\mu_{p^*} > A/n$ implies $|p^* - p_W|_1 \ge A/(an)$. We then have $D(p^* \| p_W) \ge (1/(2 \ln 2))\, |p^* - p_W|_1^2 \ge A^2/(2 (na)^2 \ln 2)$
by [11, Lemma 12.6.1]. So,

$$\Pr\left\{\sum_{i=1}^n W_i > A\right\} \le (n+1)^{|\mathcal{W}|} \exp\left(-\frac{A^2}{2na^2}\right).$$

Similarly one can show that $\Pr\{\sum_{i=1}^n W_i < -A\} \le (n+1)^{|\mathcal{W}|} \exp\left(-A^2/(2na^2)\right)$. So the result follows. □
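For readers checking the constants, the passage from Sanov's bound to the final exponent is only a change of base from 2 to $e$. Spelled out, using the lower bound on the divergence stated in the proof:

```latex
\begin{align*}
(n+1)^{|\mathcal{W}|}\, 2^{-n D(p^* \| p_W)}
  &\le (n+1)^{|\mathcal{W}|}\, 2^{-n \cdot \frac{A^2}{2 (na)^2 \ln 2}} \\
  &= (n+1)^{|\mathcal{W}|}\, 2^{-\frac{A^2}{2 n a^2 \ln 2}} \\
  &= (n+1)^{|\mathcal{W}|} \exp\left(-\frac{A^2}{2 n a^2}\right),
\end{align*}
```

where the last step uses $2^{-z/\ln 2} = e^{-z}$.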

Acknowledgments 

The authors gratefully acknowledge support from the CUHK direct grant, the CU-MS-JL grant, and a grant from 
the Bharti Centre for Communication. We would like to thank S. Shenvi for his interest and involvement in several 
stages of this work. We would also like to thank D. Manjunath for fruitful discussions. 

References 

[1] S. Shenvi, B. K. Dey, S. Jaggi, and M. Langberg, ""Real" Slepian-Wolf codes," in IEEE International Symposium on Information Theory (ISIT), (Toronto, Canada), July 2008.
[2] D. Slepian and J. K. Wolf, "Noiseless coding of correlated information sources," IEEE Transactions on Information Theory, vol. 19, pp. 471-480, July 1973.
[3] S. Pradhan, J. Kusuma, and K. Ramchandran, "Distributed compression in a dense microsensor network," IEEE Signal Processing Magazine, vol. 19, pp. 51-60, March 2002.
[4] I. Csiszár and P. Narayan, "Common randomness and secret key generation with a helper," IEEE Transactions on Information Theory, vol. 46, pp. 344-366, Mar. 2000.
[5] R. Puri and K. Ramchandran, "Prism: a new robust video coding architecture based on distributed compression principles," in Proceedings of the Allerton Conference on Communications, Control, and Computing, October 2002.
[6] A. J. Hoffman, "The role of unimodularity in applying linear inequalities to combinatorial theorems," Annals of Discrete Mathematics, vol. 4, pp. 73-84, 1979.
[7] I. Csiszár, "Linear codes for sources and source networks: error exponents, universal coding," IEEE Transactions on Information Theory, vol. 28, no. 4, pp. 585-592, 1982.






[8] E. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, pp. 489-509, February 2006.
[9] I. Csiszár and J. Körner, "Towards a general theory of source networks," IEEE Transactions on Information Theory, vol. 26, no. 2, pp. 155-165, 1980.
[10] C. E. Shannon, "A mathematical theory of communication," Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, 1948.
[11] T. Cover and J. Thomas, Elements of Information Theory. John Wiley and Sons, 1991.
[12] J. Garcia-Frias and Y. Zhao, "Compression of correlated binary sources using turbo codes," IEEE Communication Letters, pp. 417-419, October 2001.
[13] T. P. Coleman, A. H. Lee, M. Médard, and M. Effros, "On some new approaches to practical Slepian-Wolf compression inspired by channel coding," in Proceedings of the Conference on Data Compression, p. 282, March 2004.
[14] D. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, pp. 1289-1306, April 2006.
[15] E. Candès and T. Tao, "Decoding by linear programming," IEEE Transactions on Information Theory, vol. 51, pp. 4203-4215, December 2005.
[16] R. Urbanke and B. Rimoldi, "Lattice codes can achieve capacity on the AWGN channel," IEEE Transactions on Information Theory, vol. 44, no. 1, pp. 273-278, 1998.
[17] W. Feller, An Introduction to Probability Theory and Its Applications, Volume II (2nd ed.). New York: John Wiley & Sons, 1972.
[18] R. B. Ash, Information Theory. New York: Dover Publications, Inc., 1965.
[19] R. W. Yeung, Information Theory and Network Coding. Springer. Available at http://www.springerlink.com/content/978-0-387-79233-0.



October 8, 2008 



DRAFT 



